Abstract. In this paper we present a modeling methodology for BPMN,
the standard notation for the representation of business processes. Our
methodology simplifies the development of collaborative BPMN diagrams,
enabling the automated creation of skeleton process diagrams
representing complex choreographies. To evaluate and tune the methodology,
we have developed a tool supporting it, which we apply to the
modeling of an international patenting process as a working example.

1 Introduction

While it is typical to deal with business processes that start and end inside a sin- 
gle organization, many processes are not constrained inside the walls of a single 
company. For example, supply chains can be seen as large processes involving 
suppliers, manufacturers, distributors and retailers. Unfortunately, processes
involving many actors may easily become very complex, and this complexity may
be inherited by the tools (languages/notations and editors) used to design or
document them.

BPMN is the OMG standard notation for the representation of business
processes [1]. Even if BPMN constructs are very intuitive, large business process
diagrams involving both collaborations between several actors and details on 
the single participants can be very hard to design or document without any as- 
sociated methodology — much like building a miniature ship model with glue 
and screws, but without instructions. In this paper we introduce a methodology 
for the translation of informal process descriptions into BPMN diagrams, and 
a software design tool supporting it. In particular, we provide a tool to auto- 
matically generate diagrams showing the collaborative aspects of the process, 
starting from annotated textual requirements. 

Why do we need a specific methodology for BPMN? The reason lies in the very
nature of this language, which enables the representation of three important 
aspects of business processes, making it a unique modeling tool. First, we can 
represent a choreography of processes, i.e., how different processes interact with 
each other to fulfill a common objective. Second, we can represent the
orchestration of a process, i.e., its internal organization into sequences of activities.
Third, BPMN allows the representation of the same information at different lev- 
els of detail, using sub-processes — this being fundamental to provide different 
views of the same process to people with different roles, like top managers and 
technical staff. 

In summary, BPMN enables the representation of complex scenarios
because it can include many different aspects in a single diagram: choreography,
orchestration, and data, at several different levels of abstraction [1, 2]. Therefore,
the main idea behind our approach is that the initial requirements can be split
into different classes, which can be specifically addressed during well-separated
and thus simplified modeling steps. As we have illustrated in Figure 1, after a
typical pre-processing of the available informal requirements aimed at removing 
ambiguities and producing a dictionary with all definitions and synonyms, we 
split them into atomic statements referring to one of the following aspects: (D) 
data, (I) interactions between different participants, and (L) local work of a sin- 
gle participant. At this point, each class can be processed independently from 
the others. 

Fig. 1. A diagram summarizing the proposed methodology



To the best of our knowledge, this is the first presentation of a methodology 
for BPMN modeling, and in particular of a methodology to automate the drawing 
of the collaborative portion of distributed processes. Obviously, it is based on 
best practices taken from existing data and software modeling methodologies like 
the IBM Rational Unified Process (www-01.ibm.com/software/awdtools/rup),
the IDEF methods (http://www.idef.com), data modeling using ER diagrams
and Object Process Modeling (http://www.objectprocess.org). The POEM 
(Process Oriented Enterprise Modelling) methodology uses BPMN as one of 
several basic diagram types. Although we are not aware of existing presentations 
of this methodology, still under development, it seems to have a wider scope 
than our proposal, covering several additional aspects of an enterprise, while we 
present specific results regarding the BPMN notation. 

The design tool developed to support our methodology is a plug-in for
Microsoft Visio, and uses an extension of the POEM stencil. The tool can be
used to design BPMN diagrams, to annotate them with additional attributes
(like the cost of activities) and to generate their XPDL representations [4, 5].
A time-limited beta version of the tool can be downloaded from the Web site
http://bpm.cs.unibo.it



1.1 An Overview of the Methodology: Main Phases 

First, we need to split the requirements into small atomic statements, and assign 
each statement to one of the three aforementioned classes. The user interface 
to import, edit and annotate the requirements is illustrated in Figure 2. After
the identification of data and participants, which can be tagged directly on the
imported text using our tool, the assignment will usually be straightforward:

- If the statement concerns only data, then it will be a Data requirement. 

- If the statement contains one single participant (and describes one or more 
actions), then it will be a Local requirement. 

- If the statement refers to two participants, this will indicate an Interaction 
requirement, and we should also be able to identify the exchanged data. 
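
The assignment rule above can be sketched in a few lines of Python. This is only an
illustrative sketch, not the logic of our tool: the Statement class, its fields and the
classify function are hypothetical names, and we assume that participants and data
have already been tagged on each statement.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    """An atomic requirement with its tagged participants and data objects."""
    text: str
    participants: list = field(default_factory=list)
    data: list = field(default_factory=list)


def classify(statement):
    """Return 'D', 'L' or 'I' following the three rules listed above."""
    n = len(set(statement.participants))
    if n == 0:
        return "D"  # only data is mentioned
    if n == 1:
        return "L"  # local work of a single participant
    if n == 2:
        return "I"  # interaction between two participants
    # Assumption: more than two participants means the statement is not atomic.
    raise ValueError("statement is not atomic: split it further")


report = Statement(
    "The report shall be transmitted by the external consultant to the CIO.",
    participants=["external consultant", "CIO"],
    data=["report"],
)
print(classify(report))
```

A statement mentioning more than two participants is rejected here, on the
assumption that it should first be split into smaller atomic statements.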

Fig. 2. Requirements managed using our design tool



Data modeling We can now start modeling the data (D), which is a primary 
component of real business processes and will thus drive all the modeling activ- 
ities. In fact, a process is basically a sequence of activities aimed at modifying 
some data or objects, and the production of new data is the way in which
business processes generate value; for example, many business processes are used
to transform raw materials into final products. As BPMN offers limited support
for the modeling of data, being more activity/event-oriented, these requirements
can be addressed by existing data modeling approaches and attached to the
diagrams as complementary documentation. In our tool we use an extended version
of BPMN, obtained by including some features of ER and data flow diagrams
[6, 3, 7]. However, this is not required by our methodology, which applies to
standard BPMN diagrams; therefore, we will not present the details of our extension
in this work.

After this modeling step, we will have identified a list of all the data/objects 
referenced in the requirements — the next step will be the definition of how they 
are exchanged among different participants. 







Fig. 3. The skeleton diagram corresponding to the statement: The report shall, as soon 
as it has been established, be transmitted by the external consultant to the CIO. This 
diagram has been generated automatically, starting from this annotated statement 



Interaction modeling Data flows will then be used to generate a so-called
skeleton diagram representing how data is exchanged between the participants 
to produce the final products of the process. Basically, during this step we focus 
on choreography, i.e., we identify all the participants and their interactions (I). 
Each participant is represented using a BPMN pool, and we draw a message flow 
between two of them for each requirement. In this way, after having identified 
all interaction requirements we can automatically build a skeleton diagram, as
we have exemplified in Figure 3: each interaction between participants A and B
corresponds to a Send Data activity in A, a Receive Data activity in B, and a
Process Data sub-process in B, indicating that the received data will be later
manipulated. This sub-process will be expanded during the next step of the methodology.
Notice that in this paper we do not deal with the verification of choreographies,
but with the formalization of existing informal descriptions of a choreography.
Automated verification tools will obviously be of great utility to check the
designed diagram, but this is an orthogonal problem with respect to the scope
of this paper.
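
The translation of interaction requirements into a skeleton can be sketched as
follows. This is a minimal illustration under our own assumptions, not the code of
the tool: the Interaction class and the build_skeleton function are hypothetical
names, and the skeleton is represented with plain dictionaries and tuples rather
than real BPMN or XPDL elements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    """An I-statement: sender, receiver and the exchanged data object."""
    sender: str
    receiver: str
    data: str


def build_skeleton(interactions):
    """Return (pools, message_flows) for a list of I-statements.

    pools maps each participant to the ordered elements of its pool;
    message_flows lists (source, target) pairs across pools.
    """
    pools = {}
    message_flows = []
    for it in interactions:
        send = f"Send {it.data}"
        receive = f"Receive {it.data}"
        process = f"Process {it.data}"  # collapsed sub-process, expanded later
        pools.setdefault(it.sender, []).append(("task", send))
        pools.setdefault(it.receiver, []).extend(
            [("task", receive), ("sub-process", process)]
        )
        message_flows.append(((it.sender, send), (it.receiver, receive)))
    return pools, message_flows


pools, flows = build_skeleton(
    [Interaction("external consultant", "CIO", "report")]
)
for participant, elements in pools.items():
    print(participant, elements)
```

For the statement of Figure 3, this yields one pool per participant, a single message
flow between them, and a collapsed Process report sub-process in the pool of the CIO.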

Local modeling Now, as we exemplify in Section 2, we will have a skeleton
diagram with all the participants (pools) and all messages exchanged between 
them, representing the complete (abstract) data paths used to produce the final 
outcome of the process, be it a document, a product, or any other valuable ob- 
ject. For each exchanged message, we will also have a sub-process (the rectangle 
with a small plus sign represented in Figure 3) hiding the local activities
performed by the participant to manipulate the data. Therefore, we can focus on the
remaining requirements (L) describing these activities. This modeling step can 
be performed in a top-down way, following the philosophy behind BPMN which 
uses abstraction levels as a basic tool to provide different views on the same 
process. Therefore, L-statements will be associated to specific sub-processes, hi- 
erarchically organized, and finally added to the diagram. For example, consider 
the following statements: 

1. The CIO shall store a copy of the report into the archive, and 

2. prepare an IT plan as follows: 

a) first, collect information on all the systems currently used in the com- 
pany, 

b) then evaluate their life cycle state (trailing, leading or bleeding edge). 

These can be modeled as illustrated in Figure 4, where we have also used a store
(the archive) that is one of our data modeling extensions, borrowed from Data
Flow Diagrams [3], and whose meaning should be intuitively evident. Later,
the Prepare IT plan sub-process can be expanded including the statements
describing this activity (2.a, 2.b and following, in this example), and so on
recursively.
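
The hierarchical organization of L-statements can be pictured as a tree of
sub-processes. The following sketch only illustrates this idea; the SubProcess class
and its outline method are hypothetical and do not reflect the internal data
structures of our tool.

```python
from dataclasses import dataclass, field


@dataclass
class SubProcess:
    """A named group of L-statements, possibly containing nested groups."""
    name: str
    statements: list = field(default_factory=list)
    children: list = field(default_factory=list)

    def outline(self, depth=0):
        """Return the hierarchy as indented lines, one level per nesting."""
        lines = ["  " * depth + "[+] " + self.name]
        for statement in self.statements:
            lines.append("  " * (depth + 1) + "- " + statement)
        for child in self.children:
            lines.extend(child.outline(depth + 1))
        return lines


# The statements of the example above, attached to the Process report sub-process.
process_report = SubProcess(
    "Process report",
    statements=["Store a copy of the report into the archive"],
    children=[
        SubProcess(
            "Prepare IT plan",
            statements=[
                "Collect information on all the systems currently used",
                "Evaluate their life cycle state",
            ],
        )
    ],
)
print("\n".join(process_report.outline()))
```

Each nesting level corresponds to one expansion step of the top-down modeling.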

Fig. 4. The modeling of local (L) requirements, in a top-down fashion

2 Working Example

In this section we apply our methodology to a summarized version of the process 
described in the Patent Cooperation Treaty, providing instructions on how to
file an international application for a patent that can be then submitted to the 
patent offices of some of the States covered by the treaty. 

2.1 Participants, Data and Glossary 

We start our example by assuming that we work on a single text file. This file can
be processed as usual, by clarifying unclear sentences, identifying synonyms, 
replacing them with consistent terms, and building a technical dictionary. 

Then, we may identify all the data objects mentioned in the requirements, 
and all the participants. In Figure 5 we have represented the requirements
describing this example, already formatted according to this preliminary analysis.
In particular, we have identified and underlined all data objects: 

- application, 

- international application, 

- request, 

- home copy, 

- record copy, 

- search copy, 

- international search report, and 

- translation. 

In addition, we have identified and boxed all participants: 

- applicant, 

- receiving office, 

- International Bureau, 

- International Searching Authority, and

- designated office. 

At this point, we may easily proceed by splitting the requirements into three 
groups. This can be done using our tool, as illustrated in Figure 2 for the
examples given in the first part of the paper: we can import the
requirements, split them into statements and assign each statement to the data
class, to the interaction class (statements where two participants are involved)
or to the local class (statements describing the activity of a single participant).
Notice that we can also group several L-statements together, assigning a name 
to the group, and do so recursively — in this way we can support a top-down 
modeling approach for the orchestration step, discussed later. The next three 
activities will consist in the manipulation of these three sets of requirements. 

Applications for the protection of inventions in any of the Contracting States may
be filed as international applications. An international application shall
contain a request, a description, one or more claims, one or more drawings (where
required), and an abstract. The request shall contain: the designation of the
Contracting States in which protection for the invention is desired; the name
of and other prescribed data concerning the applicant; the title of the
invention; the name of and other prescribed data concerning the inventor. The
receiving Office shall accord as the international filing date the date of receipt
of the international application, provided that it has verified some basic
requirements. If the receiving Office finds that the international application did not,
at the time of receipt, fulfill these requirements, it shall invite the applicant to
file the required correction. If the applicant complies with the invitation, the
receiving Office shall accord as the international filing date the date of receipt of the
required correction. One copy of the international application shall be kept by the
receiving Office (home copy), one copy (record copy) shall be transmitted to [...]
and another copy (search copy) shall be transmitted [...]
after the expiration of 18 months from the priority date of that application.
The applicant shall furnish a copy of the international search report, of the
international application and a translation thereof, and pay the national fee, to
each designated Office not later than at the expiration of 30 months from the
priority date.



Fig. 5. Working example: international patent application filing procedure 



2.2 Data 

First, we model the data that will be manipulated by the process. As
mentioned above, we do not provide details of this phase, which is performed
according to existing data modeling methodologies. For example, an international
application (one of the data objects in the working example) is composed of 
several parts, like a request and a description, that can be visualized on the final 
diagrams using our tool or illustrated in other diagrams using specific notations. 

2.3 Choreography

Now, we can start describing the dynamics of the data, focusing on data exchange 
between different participants. Using the approach described in the introduction, 
we can translate each I-statement into a triple of activities, generating the
diagram represented in Figure 6. This diagram, which derives directly from the
isolated statements and can thus be easily produced semi-automatically, clearly
describes the main data paths and gives a first, high level view of the process. 

2.4 Orchestration 

At this point, we can refine the skeleton diagram including the details of each 
participant (pool) in a top-down way, associating each local statement from 
the requirements to one specific sub-process. For instance, let us focus on the 
Process International Application sub-process in the Receiving Office
pool (see footnote 4). There are three statements corresponding to this activity:

- The receiving Office shall accord as the international filing date the date of 
receipt of the international application, 

- provided that it has verified some basic requirements. 

- the receiving Office shall accord as the international filing date the date of 
receipt of the required correction. 

The modeling of these requirements is extremely easy, and this derives from the 
fact that we can model separately each sub-process in the skeleton, and focus 
on a limited portion of the final diagram, thanks to our previous modeling steps 
and statement splitting. This can be repeated for each sub-process, leading to 
the definition of the complete diagram illustrated in Figure 7.

3 A Summary of the Methodology 

The methodology proposed in this paper is composed of the following main steps: 

1. Pre-process the initial requirements (remove ambiguities, update and refine
unclear sentences and generate a dictionary with explanations of the technical
terms and indications of synonyms).

2. Split the requirements into elementary statements. 

3. Separate data (D) statements from activity statements. 

4. Identify the participants, and mark each statement as a local (L) activity 
(orchestration) or an interaction (I) among participants (choreography). 

5. Draw the skeleton of the process, modeling interaction activities. 

6. Tree-structure local activities, associate them with sub-processes in the skeleton
diagram, and model them in a top-down way by increasing the level of detail
at each iteration (if necessary).

4 We can also consider different pools as lanes of a single pool, but this is not relevant
to our discussion.

Fig. 6. Process Skeleton obtained using our tool. This diagram has also been generated
automatically, starting from the annotated textual description of the process

In this way, part of the modeling activity can be semi-automated, and the
definition of the orchestration inside each pool can be performed by focusing 
on small portions of the initial requirements. In addition, each statement can
be easily associated with a specific part of the final diagram, and it can then be
verified whether the diagram is complete, i.e., whether it models all the requirements.
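
This completeness check can be sketched as a simple coverage test. The example
below is an assumption-laden illustration, not a feature description of our tool: the
function name, the statement identifiers and the trace dictionary are hypothetical.

```python
def unmodeled_statements(statement_ids, trace):
    """Return the statements not yet associated with any diagram element.

    trace maps a diagram element name to the list of statement ids it models.
    """
    covered = {sid for ids in trace.values() for sid in ids}
    return [sid for sid in statement_ids if sid not in covered]


# Hypothetical identifiers for three statements and two diagram elements.
trace = {"Send report": ["S1"], "Process report": ["S3"]}
print(unmodeled_statements(["S1", "S2", "S3"], trace))
```

The diagram is complete, in the sense used above, when the returned list is empty.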

To the best of our knowledge, this work describes the first methodology to 
support the modeling of collaborative business processes with BPMN, and has 
been developed focusing on the specific features of this notation. In addition, 
using the tool exemplified in this paper, we have been able to evaluate this
methodology and to appreciate the simplification that it enables, as
highlighted by the working example presented in this paper.

References 

1. OMG: Business process modeling notation specification (2009) 

2. Barros, A., Dumas, M., Oaks, P.: Standards for web service choreography and 
orchestration: Status and perspectives. In: Workshop on Web Service Choreography 
and Orchestration for Business Process Management. Volume 3812 of LNCS. (2006) 

3. Gane, C., Sarson, T.: Structured Systems Analysis: Tools and Techniques. IST, Inc.
(1977)

4. Magnani, M., Montesi, D.: BPMN: How much does it cost? an incremental approach. 
In: Business Process Management (BPM), 5th International Conference. Volume 
4714 of LNCS. (2007) 

5. Oasis: Process definition interface - XML process definition language (XPDL) spec- 
ification (2005) 

6. Chen, P.: The Entity-Relationship model — toward a unified view of data. Trans- 
actions on Database Systems 1(1) (1976) 9-36 

7. Yourdon, N.E.: Just enough structured analysis, ch. 9: Dataflow diagrams (2006) 
www.yourdon.com. 







Fig. 7. Final process 



